Analysis of multi-strain infection of vaccinated and recovered population through epidemic model: Application to COVID-19

In this work, an innovative multi-strain SV EAIR epidemic model is developed for the study of the spread of a multi-strain infectious disease in a population infected by mutations of the disease. The population is assumed to be completely susceptible to n different variants of the disease, and those who are vaccinated and recovered from a specific strain k (k ≤ n) are immune to previous and present strains j = 1, 2, ⋯, k, but can still be infected by newer emerging strains j = k + 1, k + 2, ⋯, n. The model is designed to simulate the emergence and dissemination of viral strains. All the equilibrium points of the system are calculated and the conditions for existence and global stability of these points are investigated and used to answer the question as to whether it is possible for the population to have an endemic with more than one strain. An interesting result that shows that a strain with a reproduction number greater than one can still die out on the long run if a newer emerging strain has a greater reproduction number is verified numerically. The effect of vaccines on the population is also analyzed and a bound for the herd immunity threshold is calculated. The validity of the work done is verified through numerical simulations by applying the proposed model and strategy to analyze the multi-strains of the COVID-19 virus, in particular, the Delta and the Omicron variants, in the United State.


Introduction
The growing threat of infectious diseases with resistance to drugs and vaccinations, causing large number of deaths worldwide, is a cause for concern to the medical community and the general population. Scientists around the world are working to learn more about such diseases in order to study how likely an emerging variant of the disease can spread more easily than existing original variants. More data and analyses are needed for such study. These analyses will shed more light on the possibility of reinfections in people who already recovered from original strain, and infections in people who are fully vaccinated against original or previous strains. An example of such disease is an emerging virus called the corona virus 2019  virus that has infected and killed millions around the world within a period of two years. The virus was caused by the virus species 'severe acute respiratory syndrome corona virus', named SARS-CoV-2. The airborne transmission occurs by inhaling droplets loaded a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 with SARS-CoV-2 particles that are expelled by infectious people. Symptoms of the virus appear 2-14 days of exposure to certain strains of the virus. Several SARS-CoV-2 variants have emerged around the world, with each new variants having different characteristics. The COVID-19 virus evolves as changes in its genetic code occur during replication of the genome. The United States Centers for Disease Control and Prevention (CDC) and the US government SARS-COV-2 Inter-agency Group (SIG) [1] have confirmed the emergence of at least twelve new variants of the virus. The SIG group evaluates the risk of the circulating SARS-CoV-2 variants in the United States and make recommendations about the impact, severity, and how spread the virus is. Lineages of these variants that are classified as variant being monitored (VBM) and designated as Variant of Concerned (VOC) in the United States may lead to more severe cases of the virus. By lineages, we mean a group of related virus variant from a common ancestor. The variant, called Omicron-B.1.1.529 [2] was first detected in specimens collected on November 11, 2021 in Botswana and on November 14, 2021 in South Africa. It was first detected in the United States on December 1, 2021 and classified as a VOC. There are increased attention given to the Omicron variant. It is now the main circulating variant in the United States as of early January, 2022, accounting for about 95% of the US reported cases because of its high infection rate. Studies show that even vaccinated individuals can still be infected with this new variant if proper care is not taken. Scientist around the world are working to learn more about this new variant, and to gather more data for the purpose of studying if the variant causes more illness, hospitalizations, and death than infection with other variants. As of early December 2021, one of the new variants, called the Delta-B.1.617.2 variant was said to be the main circulating variant in the United States, and also classified as variant of concern by the SIG group. It was first identified in India and was reported to spread much faster and causes more severe cases than other early variants in the United States, probably causing twice [3]  , and Gamma-P.1 variants were first discovered in the United Kingdom, South Africa, and Japan/Brazil, respectively. We aim to study how these variants are being transmitted and the impact that vaccines are having on mitigating the number of infection cases in a particular population.
Several mathematical models  have been developed to study the transmission of infectious diseases. Some of these works [5,11,20,24,31,32,[37][38][39][40][41][42][43] discussed the transmission of infectious disease caused by the variants and lineages of the COVID-19 virus. SIR models including a modified SIR model with two strains and vaccinated group [10], a generalized SIR model with n-strains [39], a SIR model with complete cross-protection and nonlinear force of infection [44], a coupled multi-strain SIR epidemic model [24] have been developed to describe the transmission of the COVID-19 strains. Other works such as a multi-strain SEIR models with optimal control [5,6], multi-strain SEIR models with saturated and general incidence rates [5,15], SIRD [45] and SEIPAHRF model [46] with Caputo fractional derivative, SCIRP model incorporating media influence [47], and statistical analysis [48] have also been considered in describing the transmission of the virus and its strains. We direct the readers to the work of Hattaf et al. [49,50] for more recent information about the fractional differential equations and its generalization.
Viruses undergo changes and these changes are cause for concern for people who have recovered from the virus, and also to those who are vaccinated against certain strain of the virus. As discussed in Fudolig et al. [10], a highly infectious emergent strain can infect the susceptible population before the original strain, thereby impeding the spread of the original strain or causing the two strains to coexist in an endemic equilibrium. For this reason, the need to determine conditions in which a newly emerged strain and an existing strains that have a means of immunity will coexist in a population is of utmost important.
Most of the papers mentioned above often utilize simpler versions of the multi-strain network, and provides less mathematical details on the asymptotic behavior of the model. In this work, we develop a multi-strain compartmental model by assuming a population is completely susceptible to n-different variants of a particular virus at the beginning of an epidemic, with the population of the region partitioned into compartments consisting of susceptible population, population vaccinated against strain k (1 � k � n) of the virus, population exposed to strain k of the virus, infected asymptomatic population with strain k of the virus, infected symptomatic population with strain k of the virus, and population that recovered from strain k of the infection. By denoting P = {1, 2, � � �, n} and S r 2 2 P as a subset of the power set 2 P with r number of strains, r = 0, 1, � � �, n, with r = 0 representing disease-free case, we study the existence and stability conditions for the equilibrium point corresponding to the scenario where only strains in S r survive. This result is used to estimate the secondary number of infections produced by strain-k infected individual when introduced into a completely susceptible population. This number helps public health expert control the spread of the virus as new variants emerge. Conditions under which an endemic with more than one strain of the virus exist are calculated. The question as to whether this is possible is answered by studying the stability analysis of endemic strains. This model also helps us to understand the correlation between the daily number of administered vaccines and the number of infected, exposed, and recovered population in the region better. To understand the disease dynamics better, we study a powerful quantitative concept that can be used to characterize the contagiousness of each strain of the infectious disease and how transmissible they are. The basic reproduction number, which is the expected number of secondary cases produced by a typical infectious individual in a completely susceptible population in the presence and absence of vaccination are calculated. This study also helps to shed more light on the possibility of a disease being eliminated from a population if enough individuals are immune due to either vaccination or recovery from prior exposure to the disease. A bound for the herd immunity threshold, which is the minimum proportion of the population that must be vaccinated in order to stop the spreading of the disease in the population is calculated and analyzed for each variants.
This paper is organized as follows: In Section 2, a model is developed for the transmission of multi strain infectious diseases for the case where individuals vaccinated against specific strains are immune to that strain and its predecessors but can still be infected by newer emerging strains.The validity of the model, together with the existence and uniqueness of its solution is proved in this section. The reproduction numbers for the cases where the population is vaccinated and when it is not vaccinated are calculated. With these, the effect of vaccination in mitigating infection is studied. Existence of equilibrium points for the case where certain strains persist in the population is examined. In Section 3, the local and global stability of all equilibrium points for model (1) is investigated. In addition to the assumption made in Section 2, an extension of model (1) for the case where those that recovered from a particular strain can get infected by emerging strain is derived and analyzed in Section 4. Existence and stability of equilibrium points for the model is also discussed. Numerical simulations are carried out in Section 5 by applying models (1) and (35) to analyze COVID-19 data. Summary of the work done in this work is discussed in Section 6. strain k of the virus, population E k (t) of exposed individuals infected with strain/variant k of the virus, infected asymptomatic population A k (t) with strain/variant k of the virus, infected symptomatic population I k (t) with strain/variant k of the virus, and population R k (t) that recovered from strain k of the virus at a given time t, for k = 1, 2, � � �, n. These state variables are described in Table 1. We assume that the order of existence of newer strains follow k = 1, 2, � � �, n. That is, the population is first infected by strain 1, followed by strain 2, � � �, n. We assume those who are vaccinated and recovered from strain k are immune to previous and present strains j = 1, 2, � � �, k, but can be infected by newer strains j = k + 1, k + 2, � � �, n. We can see, for example, that this assumption is satisfied in the case of COVID-19 and supported by the United States Centers for Disease Control and Prevention (CDC), who claimed that current vaccines approved in the United States are expected to protect against severe illness, hospitalizations, and deaths caused by variants of the virus, although people who are fully vaccinated can still be infected. We assume that the migration rate into the susceptible population not receiving vaccination against any of the strains of the virus is (1 − q)μ, while the migration rate into the population vaccinated against strain k is μq k , where q ¼ P n k¼1 q k < 1. The parameter β j denotes the per capita contact rate of susceptible individuals with symptomatic infected strain j; γ j denotes the per capita contact rate of susceptible individuals with asymptomatic infected strain j; � b j;k and � g j;k denote the reduced per capita contact rates of symptomatic and asymptomatic strain j infected individual, respectively, that are vaccinated against strain k, j = k + 1, � � �, n; h j, k and � j, k denote the per capita contact rates of symptomatic and asymptomatic strain j infected individual, respectively, that recovered from strain k, j = k + 1, � � �, n. These parameters are described in detail in Table 2.
In this section, we study a case where individuals vaccinated against specific strains are immune to that strain and its predecessors but can still be infected by newer emerging strains. Without loss of generality, we assume the population is normalized so that the sizes S, V k , E k , A k , I k , and R k are in percentages, for k = 1, 2, � � �, n. The model governing the transmission of the infectious disease for this case follows the system of deterministic differential equation where and the states and parameters in the model are described in Tables 1 and 2, respectively.. An extension of model (1) to include a case where those that recovered from a particular strain can get infected by emerging strain is discussed in Section 4. In order to better understand the transmission dynamics described in (1), we give a schematic diagram of the model in Fig 1. The circle compartments represent group of individuals. An arrow pointing out of a compartment represents migration of individuals out of the compartment.
Remark 1. Since individuals in the vaccinated group V j are only assumed to be immune to strain j and its predecessors, and not to future strains k > j, we assume they are also susceptible to future strains k > j. For this reason, we expect that the rate at which an infectious individual with strain k make contact with susceptible individuals and individuals in group V j , j < k, should be the same. That is, b k ¼ � b k;j for j = 1, 2, � � �, k − 1. Likewise, we expect β k = h k, j , Vaccination against past and current strains do not provide any protection against new emerging strains. In the case where the infectivity of strain k (k > j) is different for the vaccinated group V j due to certain circumstances so that b k 6 ¼ � b k;j and g k 6 ¼ � g k;j , we present a comparison of the reproduction number for the case where b k ¼ � b k;j , g k ¼ � g k;j and the case where b k > � b k;j , g k > � g k;j for j = 1, 2, � � �, k − 1 in Remark 3. Also, based on model (1), an individual infected with certain strain k cannot be infected with another strain j 6 ¼ k at a given time t.

Validity of the epidemic model (1)
In this section, we discuss the validity of the proposed epidemic model (1), that is, we discuss the existence and uniqueness of the solution of (1).
Since the population is assumed to be normalized, we have A k0 þ I k0 þ R k0 Þ ¼ 1 denoting the entire population. It follows that dN/dt = μ − μN, N(t 0 ) = 1, so that the population size is constant over time. This suggests that the birth and death rates (with the recruitment rate into the susceptible S and vaccinated V k classes in the absence and presence of vaccines being (1 − q)μ and q k μ, respectively, k = 1, 2, � � �, n) are assumed to be the same so that the population is constant over certain period of time. The most appropriate epidemiological feasible region where solution exist is the set The existence, uniqueness, and positiveness of the solution of (1) is shown for the case where S 0 > 0, V k0 > 0, E k0 > 0, A k0 > 0, I k0 > 0, R k0 > 0 using results from Kelley and Peterson [51].
The long term behavior of the solutions of model (1) depends on certain thresholds of the reproduction number. The reproduction number and the thresholds are calculated in sections to come.

Reproduction number
Define The disease-free equilibrium, denoted E 0 , of the system (1) is given by The dynamic model (1) can be written in the form where F j denotes the rate of appearance of new infections in compartment j, and V À j denoting the rate of transfer of individuals in and out of compartment j, respectively [52]. For any vector u = (u 1 � � �, u n ) T , define the n × n matrix M(u) by the diagonal matrix Define where q 0 = 0. Letb andg be two vectors with entries respectively, whereb 1 ¼ ð1 À qÞb 1 andg 1 ¼ ð1 À qÞg 1 . Define F and V such that where (i, j) with 1 � i, j � 3n corresponds to the index of the infected compartments. Based on our model (1), there are three infected compartments E, A, and I, each assumed to have n-different strains so that the infected compartments are the first 3n entries of x in (8). Let 0 n × n represents the zero square matrix of order n. The corresponding matrices F and V are calculated as where M(u) is as defined in (9) where M(e) −1 is the diagonal matrix MðeÞ , we see that the average length of time an individual spent being exposed, asymptomatically infected, symptomatically infected with strain k is 1/e k , 1/a k , and 1/w k , respectively. Also, the average length of time an individual exposed to strain k spent being asymptomatic with the same strain during its life time is expected to be pw k l k a k e k w k . The average length of time an individual exposed to strain k spent being symptomatic with strain k during its life time is expected to be ð1À pÞa k l k a k e k w k . Since the transmission rates of symptomatic and asymptomatic individuals with strain k in a susceptible population are β k and γ k , respectively, it follows that at the emergence of a new strain k (k � 2), the expected number of new infections produced by individual with such strain in a completely susceptible non-vaccinated population, and a population completely vaccinated against strain j, are ð1À qÞðð1À pÞa k b k þpw k g k Þl k a k e k w k ¼ ð1À qÞc k l k a k e k w k and a k e k w k , respectively, j = 1, 2, � � �, n, where we set q 0 = 0.
The expected number, R k , of new infections produced by individual with such strain k in a population containing susceptible and vaccinated individuals is obtained as where 1 − q is the proportion of those that are susceptible but not vaccinated. Using the next generation matrix [52], since F is non-negative and V is a non-singular M-matrix [53], the number R k obtained in (11) is the (k, k)-th entry of the next generation matrix FV −1 , for k = 1, 2, � � �, n. It is the expected number of new infections in compartment k (compartment exposed to strain k) produced by infected individual originally introduced into the same compartment.

Remark 2. Effect of Vaccination
We remark here that the expected number R k of new infections in the compartment exposed to strain k depends on the vaccination rates q l , l = k, k + 1, � � �, n. If no one is vaccinated in the population (that is, if q l = 0 for all l = 1, 2, � � �, n), then the expected number of infections caused by strain k is obtained to be It follows from Remark 1 that this number is clearly more than the reproduction number obtained in (11) where some individuals are receiving vaccination in the population. The reproduction number R k can be written in terms R c;k as showing that the ratio of numbers of infection caused by strain k in a population with vaccination to a population without vaccination is 1 − q + Q k : 1. If individuals are not vaccinated against strain k (that is, if q k = 0 for fixed k), then the expected number of infections caused by strain k increases to which is the same as R k þ q k c k l k a k e k w k . This shows that a non vaccinated strain k infected individual produces on the other hand, everyone in the population is vaccinated (that is, q = 1), then the expected number of infections caused by strain k significantly reduced to That is, a completely vaccinated population produces 1À q 1À qþQ k � 100% lesser infections than population not completely vaccinated. These analyses show the importance of being vaccinated in the population.
The reproduction number R 0 for the system, which is defined as the expected number of secondary infections produced by a typical infected individuals over the course of its infectious period, is calculated using the next generation matrix approach by van den Diessche et al. [52] as We shall later show that if R 0 < 1, then we expect a typical infected individual to produce less than one new infected individual, meaning all strains of the disease will eventually die out in the population (in the presence of vaccination). Although it is well known that a value of the reproduction number greater than one means that epidemic will persist in the population [8,52], we shall later show that a strain with a reproduction number greater than 1 can still die out on the long run if a newer emerging strain has a greater reproduction number. The value R k on the other hand, can be interpreted as the reproduction number for typical individual infected with strain k of the disease, for k = 1, 2, � � �, n. Remark 3. In the case where b k 6 ¼ � b k;j and g k 6 ¼ � g k;j such that b k > � b k;j and g k > � g k;j for all 1 � k, j � n due to some form of partial immunity as a result of vaccines, then the strain k reproduction number R k and the reproduction number R 0 in (11) and (14), respectively, are obtained as and we setĉ k;0 ¼ 0 8 k = 1, 2, � � �, n. It is easy to show that c k >ĉ k;j in this case, so that the estimates of the reproduction numbers given in (15) are greater than those given in (11) and (14), respectively. This shows that the reproduction number reduces as vaccines reduce the infectivity of the viruses (as expected). Remark 4. The reproduction number R 0 is for the case where some members of the population are vaccinated against certain strain of the disease, that is, case with compartments V 1 , V 2 , � � �, V n in the population. As explained in Remark 2, the reproduction number, denoted R c , for a completely susceptible non-vaccinated population is given by

Existence of equilibrium points
We discuss conditions under which equilibrium points of (1) exist. System (1) has many equilibrium points. Let P = {1, 2, � � �, n} denotes set of indices representing order of existence of new strain in the population, with 2 P denoting the power set of P. Let S r 2 2 P denotes a subset of 2 P with r number of strains, r = 0, 1, � � �, n, with r = 0 representing disease-free case. We study the existence conditions for the equilibrium point corresponding to the scenario where only strains in S r survives, r = 0, 1, � � �, n. The case r = 0 exists for the disease-free equilibrium E 0 case. The disease-free equilibrium is given in (6). For the case r = 1 representing case where only one strain survives, we shall denote the strain by strain m, m = 1, 2, � � �, n, and the equilibrium referred to as the strain m equilibrium point and denoted E m . We study the conditions under which such equilibrium point exists, and in general, we also study conditions under which strains with indices in S r survives, for r = 1, 2, � � �, n. Theorem 2. The strain m unique equilibrium point, denoted E m , for the epidemic model (1) exists in the feasible region T provided R m > 1.
Proof. The strain m equilibrium point E m is obtained by solving the equation h(t) = 0 and setting E k = A k = I k = R k = 0 for 1 � k 6 ¼ m � n, where h(t) is a vector in (4) containing the right hand side of (1). It follows from the equation governing A m and I m that A m ¼ pl m a m E m and I m ¼ ð1À pÞl m w m E m . Substituting these into the equations governing S, V k , k = 1, 2, � � �, n, E m , and where c m and � c m are defined in (5). If R m > 1, the strain m equilibrium point E m is obtained as

Remark 5. Theorem 2 shows that strain m alone persists at equilibrium if the expected number of secondary infections produced by strain m infected individual is greater than one. We shall later show in Theorem 8 that this equilibrium point is stable globally if
and R m > 1. Theorem 2 is also valid for the case where b k > � b k;j and g k > � g k;j , j = 1, 2, � � �, k − 1. The proof of this is shown in Theorems 17 and 18.
We denote the equilibrium point corresponding to the case where strains τ 1 , τ 2 survive by E S 2 . We give theorem under which the equilibrium E S 2 exists. Theorem 3. The epidemic model (1) has an equilibrium point E S 2 in the feasible region T provided Proof. Condition (20) implies that R t 1 > R t 2 > 1. The equilibrium point E S 2 is obtained as

PLOS ONE
Analysis of multi-strain epidemic model: Application to COVID-19 The result follows. Remark 6 Since R t 1 is non-negative, condition (20) implies that R t 1 > R t 2 > 1. This seems to suggest that the system is already in endemic state with strain τ 1 before the emergence of strain τ 2 caused an endemic. This is also confirmed in the work of Fudolig et al [10] where an SVIR model is used to analyze the transmission of the multistrains of the COVID-19 virus. Condition (20) is equivalent to We analyze the condition geometrically in Appendix B in S1 Appendix. It shows that for only strains τ 1 and τ 2 to remain in the system on the long run, the number of infection R t 2 produced by strain τ 2 -infected individuals must be more than bR t 1 aþðbÀ aÞR t 1 but not up to the number R t 1 produced by strain τ 1 -infected individuals. Once R t 2 falls outside this region, then only strain τ 1 or τ 2 remains in the system on the long run. For instance, showing that only strain τ 1 remains on the long run. In the same sense, we can show that only one strain remains in the system on the long run if R t 2 > R t 1 > 1. This endemic region is shown graphically in Appendix B in S1 Appendix.
Theorem 4. The epidemic model (1) has an equilibrium point E S r in the feasible region T provided Proof. Condition (24) implies that R t 1 > R t 2 . . . > R t r > 1. The equilibrium point E S r is obtained as and for k = 2, 3, � � �, r − 1, That is, the system is already in endemic state with strain τ i before the emergence of strain τ i + 1 , i = 1, 2, � � �, r − 1, caused an endemic. Theorem 4 shows that if only strains τ 1 , τ 2 , � � �, τ r survive, then the system (without these strains) must converge to the disease-free equilibrium state.
2.4.1 Endemic equilibrium. The result for the endemic equilibrium can be calculated from Theorem 4 by extending the set S r to S n . We state the result without proof in the next theorem.

Theorem 5. The epidemic model (1) has an endemic equilibrium point E S n given by
in the feasible region T provided ; for k ¼ 2; 3; � � � ; n À 1; In the next section, we discuss the convergence of the systemỹ in (4). We study conditions under which all strain infections are eradicated in the population on the long run. We also discuss the condition under which certain strain of the disease persists.

Stability analysis
In this section, we discuss the convergence of system (1) under certain conditions. That is, we study condition(s) under which the system converges to disease-free E 0 or equilibriums E S r , r = 1, 2, � � �, n.

Stability analysis of the disease-free equilibrium point
Theorem 6. The disease-free equilibrium E 0 is locally asymptotically stable in the feasible region T if R 0 < 1.
Given the initial conditionỹ 0 2 T , Theorem 6 shows that if the systemỹ satisfying (1) starts near the initial pointỹ 0 , then the system converges to the equilibrium point E 0 (that is, all the strain of infections are eradicated) provided the threshold R 0 < 1. We use the idea presented in [27] to prove Theorem 6.
Proof. Let I n�n be the n × n identity matrix, y ¼ỹ À E 0 , β = {β 1 , � � �, β n }, γ = {γ 1 , � � �, γ n }, the matrices U q ðbÞ and U q ðgÞ defined by where M is defined in (9). The linearization of the model (1) at the equilibrium point E 0 is derived as In order to show that the equilibrium point E 0 is locally stable, we need to show that the maximum real part, s(A), of A is negative. From the structure of the matrix A, this reduces to showing that the maximum real part of the eigenvalues of matrix À A is negative (or equivalently, If R 0 < 1, then detðAÞ > 0. Matrix A is a non-singular Z matrix that can be written in the form where L and U are lower and upper diagonal matrices, respectively, with positive diagonals obtained as :::; 3n; and 0 elsewhere; a l e l w l ð1 À R l Þ: The diagonals L j;j ¼ 1 and U i;j ¼ D j =D jÀ 1 > 0 if R 0 < 1. Therefore, the matrix A is a non-singular M matrix, and hence, a P-matrix. It follows from Berman [54] and Plemmons [53] that the maximum real part of the eigenvalues of À A is negative.
Remark 8. Theorem 6 shows that if the population starts sufficiently close to � y 0 , then the system � y converges to E 0 if R 0 < 1. That is, the proportion of the susceptible size converges to 1 − q, the proportion of the size of the vaccinated class immune to strain k converges to the vaccination rate q k of individuals with strain k of infection, and no infected class, hence no need of recovery on the long run if R 0 < 1.
We prove in the next theorem that the population, irrespective of where it starts from, converges to the point E 0 if R 0 � 1.
Theorem 7. The disease-free equilibrium E 0 is globally stable in the feasible region T if R 0 � 1.
Proof. Define the Lyapunov function V by using (6). Using the fact that the arithmetic mean of a list of non-negative real numbers is greater than or equal to the geometric mean of the same list [55], it follows that if V=dt ¼ 0, its global stability follows by the LaSalle's Invariance Principle [56] Remark 9. Theorem 7 shows that the susceptible class converges to a fraction 1 − q of the entire population size (this is simply the population size without the vaccinated group), the vaccinated class immune to strain k converges to a fraction q k of population size, and no infected class, hence no need of recovery on the long run if R 0 � 1. This suggests the threshold for disease eradication is the number R 0 .

Bound for the critical vaccination threshold
Herd immunity is a state where significant proportion of the population is immune to an infection so that only few susceptible individuals can be infected and transmit the infection [57]. Classical vaccine-induced herd-immunity threshold suggests that the spread of a disease can be stopped by vaccinating certain fraction of the population. This might be invalid and biased due to many factors such as emergence of multi variant strains of the virus/disease, the dynamic nature of virus transmission, presence of immunity due to infection, changes in implementation and adherence to public health measures, and uncertainties in vaccine effectiveness and duration of immunity [57,41]. These can cause the estimation of the Herd-immunity threshold to be imprecise. In this section, we aim to estimate a bound for the minimum proportion of the population that must be vaccinated in order for certain infection to die out in the population.
Let E j 2 (0, 1] denotes the proportion of vaccinated individuals against strain j who are protected by vaccines, for j = 1, 2, � � �, n. It follows from Remark 4 that we can write R k in (14) in terms of R c;k in (17) as follows: for some constant c j 2 (0, 1). If the vaccine is perfect and provides 100% immunity, then the vaccine effectiveness E j = 1, otherwise, E j < 1 if it provides only partial immunity. According to Theorem 7, the population reaches herd immunity with respect to strain j, with incidence of the infection declining, if R j � 1. This yield The number is the minimum proportion of a population infected that must be vaccinated to stop strain j from spreading. This number is referred to as the herd immunity threshold [41]. In the presence of multiple vaccines for strains j = 1, 2, � � �, n with effectiveness E 1 , E 2 , � � �, E n , respectively, let H m and H M denote the minimum min 1�j�n H j and maximum max 1�j�n H j herd immunity thresholds for the strains j = 1, 2, � � �, n, respectively. The interval contains the maximum proportion of the population expected to be vaccinated to stop the spread of the disease.

Remark 10. Caveats to the estimate
We note here that the bound in (33) is calculated for a population satisfying the dynamics given in (1). The bound is expected to change in the emergence of a new strain of the virus. Also, the bound is calculated without taking into consideration the proportion of those who have immunity from the virus. These and more are some of the caveats to this estimates.
Sometimes new variants emerge and disappear. Other times, new variants persist. We study conditions under which new variant, say strain m, persists in the population. In the next section, we study the behavior of the system in the case where disease is not completely eradicated in the system.

Stability analysis of strain m equilibrium
In the next theorem, we study how the population behaves on the long run if the number, R m , of new infections produced by infected individual with strain m is greater than one, while R k � 1 for 1 � k 6 ¼ m � n. Theorem 8. The strain m equilibrium E m for the epidemic model (1) is globally stable in the feasible region T if R k � 1 for all 1 � k 6 ¼ m � n and R m > 1.
Proof. According to Theorem 2, the strain m equilibrium E m exists and non-negative if R m > 1. Define the Lyapunov function V by for all k ¼ 1; 2; � � � ; n; using (19). Using the fact that the arithmetic mean of a list of non-negative real numbers is greater than or equal to the geometric mean of the same list [56], it follows that dV=dt � 0.
Since E m is the largest invariant set in the subset of T where dV=dt ¼ 0, its global stability follows by the LaSalle's Invariance Principle [56].
Remark 11. Theorem 8 shows that the system � y converges to E m on the long run if R k � 1 for 1 � k 6 ¼ m � n and R m > 1 irrespective of the starting point. That is, if the number, R m , of new infections is greater than one while R k � 1 for 1 � k 6 ¼ m � n, then on the long run, the susceptible class converges to a fraction S � ¼ ð1À qÞ R m < 1 À q of the entire population size, no exposure to any strain other than strain m in the population (hence no infection other than those caused by strain m, and no need for recovery), and before the existence of strain m (that is, k < m), the vaccinated class immune to strain k < m converges to a fraction V � k ¼ q k R m < q k of the population size. We see from Theorem 7 that if there is no endemic in the population, the susceptible population S converges on the long run to (1 − q) × 100% of the population. This number reduces by

Stability analysis of the equilibrium point E S 2
We give the proof of the stability of the equilibrium point E S 2 in the next theorem. Theorem 9. The equilibrium point E S 2 is globally stable in the feasible region T if R k < 1 for 1 � k 6 ¼ τ 1 , τ 2 � n and condition (20) is satisfied.
Proof. Assume R k � 1 for all k = 2 τ 1 , τ 2 and condition (20) is satisfied. The existence of the equilibrium point E S 2 follows from Theorem 3. Define the Lyapunov function V 2 by with φ 0 = 0. The derivative of V 2 computed along the solution of (1) is It follows from (34) and Remark 6 that R t 1 > R t 2 > 1 and We have using (22). It follows that dV 2 =dt � 0. Equality holds if S = S + , Since E S 2 is the largest invariant set in the subset of T where dV 2 =dt ¼ 0, its global stability follows by the LaSalle's Invariance Principle [56].

Stability analysis of the equilibrium point E S r
Theorem 10. For r = 3, 4, � � �, n, the equilibrium point E S r is globally stable in the feasible region T if R k � 1 for all k = 2 S r and condition (24) is satisfied. Proof. The proof of Theorem 10 is similar to that of Theorem 9 by extending S 2 to S r , r = 3, 4, � � �, n.
Remark 12. Theorem 10 can be extended to a case where S r ¼ S n .

Model for re-infected recovered and vaccinated population
There have been confirmed cases of the COVID-19 reinfections around the world [38,40,42]. In this section, in addition to assuming that individuals vaccinated against strain k can gets infected with emerging strains j > k, we also discuss the case where individuals who recovered from strain k can be infected with emerging strains j > k. For this additional assumption, we extend (1) to the form The schematic diagram of model (35) is given in Fig 2.

Validity of the epidemic model (35)
In this section, we discuss the validity of the proposed epidemic model (35). Theorem 11. If S 0 > 0, V k0 > 0, E k0 > 0, A k0 > 0, I k0 > 0, R k0 > 0, then there exist a positive unique solution of (35) in the feasible region T for all t � 0.
Proof. The proof is similar to Theorem 1.

Existence of equilibrium points for model (35)
The disease-free equilibrium of the system (35) is the same as that of the system (1), and given by

Reproduction number for model (35)
It can also be shown in a similar manner that model (35) has the same reproduction numbers R k and R 0 as that of model (1). This shows that the number of infection in a population where every individual who recovered from strain k is immune to all possible strains is the same for the population where individuals who recovered from strain k can still be infected with emerging strains j > k. Hence, for strain k infected individual, the expected number, R k of new infections produced by individual with strain k in a susceptible population satisfying (35) is the same as (11). Likewise, the reproduction number R 0 for the system (35) is obtained in a similar manner to (14).

Existence of endemic equilibrium points for model (35)
Theorem 12. The strain m unique equilibrium point E m for the epidemic model (35) exists in the feasible region T provided R m > 1. The value of E m is the same for models (1) and (35) and obtained in Theorem (2). Remark 13. Theorem 12 shows that the existence of a single strain endemic (strain m in this case) does not depend on whether the recovered class is immune to the strain or not. Regardless of the immunity status of the recovered strain k-class to the strain, the equilibrium point of the system will be E m if R m > 1. Define It can be shown that � R t k < 1 À q þ Q k . Conditions for existence of other equilibrium points are given in Theorem 13. Theorem 13. The epidemic model (35) has an equilibrium point E S 2 satisfying is satisfied. Proof. The proof is similar to that of Theorem 4. Remark 14 Unlike Remark 13, we see here that the existence of two endemic equilibrium points depends on the immunity status of the recovered class. The exposed and infectious equilibrium points E þ t k , A þ t k , and I þ t k for model (35) are greater compared to that of model (1). This is because unlike model (1), model (35) suggests that the recovered population R k is not fully immune to emerging strains j > k, increasing the possibility of populating the exposed and infected classes. In addition, condition (39) implies that R t 1 > R t 2 > 1. The condition is equiva-

Global analysis of equilibrium points for model (35) Theorem 14
The disease-free equilibrium E 0 for model (35) is locally asymptotically stable in the feasible region T if R 0 < 1 and globally stable in the feasible region if R 0 � 1. Proof. The proof is similar to that given in Theorems 6 and 7. Theorem 15. The strain m equilibrium E m for the epidemic model (35) is globally stable in the feasible region T if R k � 1 for all 1 � k 6 ¼ m � n and R m > 1.
Proof. The proof is similar to the proof of Theorem 8. Theorem 16. The equilibrium point E S 2 for model (35) is globally stable in the feasible region T if R k < 1 for 1 � k 6 ¼ τ 1 , τ 2 � n and condition (39) is satisfied.
Proof. The proof is similar to the proof of Theorem 9.

Results: Covid-19 data analysis
We apply models (1) and (35) to analyze the United States daily COVID-19 cases (number of infection cases, recovery cases, and vaccination). The daily COVID-19 cases data are available on the CDC website [58] for the COVID-19 periods 04/01/2020 till present. We analyze the confirmed COVID-19 infection cases, and vaccination cases as reported by U.S. states, U.S. territories, New York City, and the District of Columbia from the previous day. Since the two recent variant of concerns (VOC) in the United States are the Delta and Omicron variants, we consider the case where n = 2 using model (35), with the Delta variant as Strain τ 1 = 1 and the Omicron variant as strain τ 2 = 2. Two analyses are performed in this section. The first analysis is shown in Section 5.1 to confirm the validity of the results derived in this work. Using published and estimated COVID-19 parameters, we confirm the stability results for the diseasefree equilibrium, strain 1 equilibrium, strain 2 equilibrium, and the endemic equilibrium E S 2 . In section 5.2, the real COVID-19 cases for the United States is analyzed using model (35).

Simulation results using published and estimated parameters
Using model (35), we confirm the existence, and stability of the disease-free equilibrium, strain 1 equilibrium, strain 2 equilibrium, and the endemic equilibrium E S 2 for the case where strains 1 and 2 represent the Delta and Omicron variants, respectively. The CDC data for vaccination shows that about 63% of the population of the United States are fully vaccinated (either taken the two dozes of Pfizer or Moderna, or the single dose of Jannsen) as at January 25, 2022. For this reason, we chose q 1 and q 2 to be in the interval [0, 0.63]. In their paper, Saldana [59] shows that about 80% of infection was asymptomatic with average incubation and recovery time of 5 and 10 days, respectively. Also, Hay et al. [37] shows that the mean duration of Delta and Omicron's infections is 10.9 days and 9.87 days, with 95% confidence intervals (8.83, 10.9) and (9.41, 12.4), respectively. For this reason, we set r 1 , θ 1 2 [1/10.9, 1/8.83] and r 2 , θ 2 2 [1/12.4, 1/ 9.41]. Based on the five COVID-19 Pandemic Planning Scenarios estimated by CDC, the number of infections that are asymptomatic is uncertain and in the interval [0.15, 0.7], with the best estimate of 30%. We set μ = 0.0124. Bernal et al. [60] shows in their studies that the effectiveness of two doses of BNT162b2 (Pfizer) vaccines was 88.0% (95% CI, 85.3 to 90.1) among those with the delta variant. On November 16, 2020, the company Moderna announced that their vaccine is more than 94% effective at preventing COVID-19, based on an analysis of 95 cases [61]. We use these estimates for the Herd immunity plot. Following results from Jing et al, we select values for the incubation period to be between 2 days and 7 days, so that λ 1 , λ 2 2 [1/7, 1/2]. The range of parameters used in the simulation is shown in Table 3.
Simulation result for the case where the population is free of disease on the long run is shown in Fig 3.  Fig 8 shows the simulation result for the case where R 2 > R 1 > 1. 5.1.1 What happens if R 1 > R 2 > 1 but condition (39) is not satisfied?. We study a case where R 1 > R 2 > 1 but condition (39) is not satisfied. Although the condition is similar to that in Fig 6, we see here that the system converges to the strain 1 equilibrium point. That is, even though the reproduction numbers R 1 and R 2 are more than one, strain 2 still gets eradicated from the system on the long run while strain 1 caused an endemic. This study shows that the second inequality in condition (39) is necessary for the existence of endemic with more than one strain. We study an interesting case where although the reproduction number R 1 for strain 1 is greater than 1, the strain still gets eradicated from the system on the long run. This happens because a newer strain 2 caused an endemic with a higher reproduction number R 2 > R 1 > 1. This case is analyzed geometrically in Appendix B in S1 Appendix.
The estimated results for all compartments are shown in Fig 14.

Sensitivity analysis.
In this section, we study the impact of each epidemiological parameters on the disease transmission and prevalence. We are interested in discovering the parameters that have high impact on the basic reproduction number R k , k = 1, 2, � � �, n. This can be achieved using sensitivity analysis by calculating the sensitivity index of each parameters on R k . The normalized forward sensitivity index D u p of a variable F that depends differentiably on a parameter u is defined as [71,72] The analytic expression for the sensitivity index of R k , k = 1, 2, � � �, n, with respect to the . In this case, we set β 1 = 1.2, β 2 = 0.9, γ 1 = 0.5, γ 2 = 0.4, μ = 1/ 80.3, q 1 = 0.1, q 2 = 0.5, λ 1 = 1/5, λ 2 = 1/4, r 1 = 1/10.5, r 2 = 1/9, θ 1 = 0.09, θ 2 = 0.1. In this case, R 1 ¼ 2:81 and R 2 ¼ 2:45 so that R 0 ¼ 2:81. We see in this case that 1 þ , implying the existence of endemic with more than one strain. We see an endemic with both strains 1 and 2 because the population is already in an endemic state with strain 1 before strain 2 caused an endemic, and the number of secondary infection caused by strain 2 is more than 1 þ but not up to that caused by strain 1. parameters in (35) is obtained as This condition implies, from (38), that E þ 1 > 0 and E þ 2 < 0, so that the values I þ 1 , A þ 1 , and R þ 1 are positive but I þ 2 , A þ 2 , and R þ 2 are negative. This shows that only strain 1 endemic exists on the long run. A similar analysis is presented in Appendix B in S1 Appendix geometrically. https://doi.org/10.1371/journal.pone.0271446.g007 The positive sensitivity index with respect to β k , γ k , and λ k shows that an increase in the value of the strain k's symptomatic, asymptomatic transmission rates, and the latency rate leads to an increase in the basic reproduction number, R k , as expected. Likewise, the negative sensitivity index for μ, q l , l = k, k + 1, � � �, n, r k , and θ k shows that an increase in the natural death rate, vaccination rate, symptomatic and asymptomatic recovery rates, respectively, leads to a decrease in R k , as expected. Also, the zero value for the sensitivity index for q l , l = 1, 2, � � �, k − 1, shows that being vaccinated against predecessor strains l = 1, 2, � � �, k − 1 does not have any impact on the current and future strains j � k infection count. The sensitivity index D R k p is positive if g k a k > b k w k . That is, an increase in the fraction of infection cases that are asymptomatic will lead to an increase in the basic reproduction number, R k , if the total number of infection caused by an asymptomatic strain k infectious individual is more than that caused by a symptomatic strain k individual. The magnitude of the sensitivity of R k to changes in these parameters can be calculated by evaluating D R k u for each parameter estimates u in Tables 4 and 5 for  case where n = 1 and n = 2, respectively. For a particular parameter u, a higher value of D R k u suggests that R k is more sensitive to u. The sensitivity index of R k to each parameters estimated in Table 5 for model (35) is given in Table 6 for k = 1, 2.

Summary and discussion
In this work, a multi-variant epidemic model for analyzing the emergence and dissemination of viral multi-strains of an infectious disease in a population that is assumed to be completely susceptible to n different strains of the disease is developed and analyzed. A major assumption on the viral strains is that those who are vaccinated and recovered from a specific strain k � n are immune to that strain and its predecessors but can still be infected by newer emerging strains. A study of how well-poised the model is is carried out by showing the existence of non-negative solution of the derived model. The model compares the cases where the force of infection on the susceptible vaccinated and unvaccinated populations are different and the same. The reproduction number for each specific strains j = 1, 2, � � �, n is obtained and analyzed. We show that the reproduction number for the system where individuals who recovered

PLOS ONE
Analysis of multi-strain epidemic model: Application to COVID-19 from certain strain can be infected by emerging strain is the same for the system where such individuals cannot be re-infected by emerging strain. In order to shed more light on the possibility of an endemic with more than one strain of the virus, global stability analysis is obtained for different equilibrium points E S r , r = 0, 1, 2, � � �, n, of the system, with E S 0 denoting the disease-free equilibrium. It was shown with respect to model (35) that for an endemic with strains τ 1 � n and τ 2 � n (where τ 1 < τ 2 ) to occur, it must be that the reproduction number . This condition shows that for an endemic with strains τ 1 � n and τ 2 � n to occur, the population must have been in an endemic state with the first emerged strain τ 1 and the number of secondary infection caused by strain τ 1 must be greater than that of strain τ 2 , with a necessary condition that R t 2 > 1 þ . The importance of this necessary condition is emphasized numerically by showing that a system that only satisfies the condition R t 1 > R t 2 > 1 does not converge to the endemic equilibrium E S 2 with strains τ 1 and τ 2 , but instead converges to strain τ 1 equilibrium E t 1 . The later condition reduces to for the case where individuals who recovered from certain strain cannot be infected by emerging strain. A general condition is obtained for the case where endemic with strains τ 1 , τ 2 , � � �, τ r (with τ 1 < τ 2 < � � � < τ r ) exists in the population. Our analysis shows that for an endemic with strains τ 1 , τ 2 , � � �, τ r to exist, the population must first be in endemic with strain τ 1 , followed by τ 2 , � � �, τ r , so that condition (24) satisfied. Also, we showed numerically that a strain with a reproduction number greater than 1 can still die out on the long run if a newer emerging strain has a greater reproduction number. The effect of vaccines on the population is also analyzed. The herd immunity for each strain j = 1, 2, � � �, n is computed as a function of the effectiveness of vaccines against the strain. A threshold for the herd immunity for the case where there is endemic of more than one strain of the virus is also calculated. The result is applied to analyze real COVID-19 data for the Delta and Omicron variants in the United States. The reproduction numbers for the Delta and Omicron variants cases for the period 12.11.2021 to 01. 15.2022 suggest that the Delta variant cases is dying down, while there is an endemic of the Omicron variant. Using model (35), possible trajectories for the susceptible, vaccinated, exposed, infected and recovered population are plotted by first estimating the parameters in the model. The parameters are estimated by minimizing the

PLOS ONE
Analysis of multi-strain epidemic model: Application to  sum of square error between the real and estimated infection cases using the Nelder-Mead simplex algorithm [69]. Further research on the COVID-19 cases is ongoing and the results of the research will be made known once it is available. In the future, we plan to modify the model to fit emerging characteristics of the virus and extends limitations in the model to a more general case.   https://doi.org/10.1371/journal.pone.0271446.g014 Table 6. Sensitivity index of R k to parameters generated in Table 5  Analysis suggests that changes in the asymptomatic transmission rate contribute more to the changes in the reproduction number R k than the symptomatic transmission rate for the Delta (k = 1) and Omicron (k = 2) variants. Also, R k is more sensitive to changes in the infection rate λ k than the transmission rates and the fraction p k of cases that are asymptomatic. For the Delta variant, the reproduction number R 1 is more sensitive to the recovery rate θ 1 of the symptomatic class than r 1 .